AMAE: Adaptation of Pre-Trained Masked Autoencoder for Dual-Distribution Anomaly Detection in Chest X-Rays
Unsupervised anomaly detection in medical images such as chest radiographs is
stepping into the spotlight, as it sidesteps the need for labor-intensive and
costly expert annotation of anomaly data. However, nearly all existing methods
are formulated as one-class classification, trained only on representations
from the normal class, and therefore discard a potentially significant portion
of the unlabeled data. This paper focuses on a more practical setting,
dual-distribution anomaly detection for chest X-rays, which uses the entire
training set, including both normal and unlabeled images. Inspired by a modern
self-supervised vision transformer trained to reconstruct missing image
regions from partial inputs, we propose AMAE, a two-stage algorithm for
adapting a pre-trained masked autoencoder (MAE). Starting from the MAE
initialization, AMAE first creates synthetic anomalies from normal training
images only and trains a lightweight classifier on frozen transformer
features. Subsequently, we propose an adaptation strategy to leverage
unlabeled images containing anomalies: pseudo-labels are assigned to the
unlabeled images, and two separate MAE-based modules model the normative and
anomalous distributions of the pseudo-labeled images. The effectiveness of the
proposed adaptation strategy is evaluated at different anomaly ratios in the
unlabeled training set. AMAE yields consistent performance gains over
competing self-supervised and dual-distribution anomaly detection methods,
setting a new state of the art on three public chest X-ray benchmarks: RSNA,
NIH-CXR, and VinDr-CXR.
Comment: To be presented at MICCAI 202
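As a rough illustration of the first stage, here is a minimal PyTorch sketch, assuming a CutPaste-style patch corruption for anomaly synthesis and a frozen ViT encoder that returns pooled features; the names and the corruption recipe are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

def synthesize_anomaly(img: torch.Tensor, patch: int = 32) -> torch.Tensor:
    """Paste a random crop of a normal X-ray into another location,
    producing a synthetic 'anomalous' version (CutPaste-style assumption)."""
    _, h, w = img.shape
    y1 = torch.randint(0, h - patch, (1,)).item()
    x1 = torch.randint(0, w - patch, (1,)).item()
    y2 = torch.randint(0, h - patch, (1,)).item()
    x2 = torch.randint(0, w - patch, (1,)).item()
    out = img.clone()
    out[:, y2:y2 + patch, x2:x2 + patch] = img[:, y1:y1 + patch, x1:x1 + patch]
    return out

class AnomalyHead(nn.Module):
    """Lightweight classifier on frozen transformer features; the encoder is
    assumed to return (B, dim) pooled features from an MAE-pretrained ViT."""
    def __init__(self, encoder: nn.Module, dim: int = 768):
        super().__init__()
        self.encoder = encoder.eval()          # frozen MAE encoder
        for p in self.encoder.parameters():
            p.requires_grad = False
        self.head = nn.Linear(dim, 1)          # normal vs. synthetic anomaly

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            feats = self.encoder(x)
        return self.head(feats)
```

Training then reduces to binary cross-entropy between original images and their `synthesize_anomaly` counterparts, with only the linear head updated.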
Learn to synthesize and synthesize to learn
Attribute-guided face image synthesis aims to manipulate attributes of a face
image. Most existing image-to-image translation methods either perform a fixed
translation between two image domains using a single attribute or require
training data with the attributes of interest for each subject. Such methods
can therefore only train one specific model per pair of image domains, which
limits their ability to deal with more than two domains. Another disadvantage
is that they often suffer from mode collapse, which degrades the quality of
the generated images. To overcome these shortcomings, we propose an
attribute-guided face image generation method that uses a single model capable
of synthesizing multiple photo-realistic face images conditioned on the
attributes of interest. In addition, we adopt the proposed model to increase
the realism of simulated face images while preserving facial characteristics.
Compared to existing models, synthetic face images generated by our method
exhibit good photorealistic quality on several face datasets. Finally, we
demonstrate that the generated facial images can be used for synthetic data
augmentation and improve the performance of a facial expression recognition
classifier.
Comment: Accepted to Computer Vision and Image Understanding (CVIU)
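A minimal sketch of the single-model, multi-domain idea, assuming StarGAN-style conditioning where the target attribute vector is tiled into a spatial map and concatenated with the input image; the toy network below stands in for the paper's actual architecture.

```python
import torch
import torch.nn as nn

class AttributeGenerator(nn.Module):
    """One generator for all attribute domains: the target attributes are
    broadcast to a spatial map and concatenated with the image channels."""
    def __init__(self, n_attrs: int, ch: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + n_attrs, ch, 7, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, 3, 7, padding=3),
            nn.Tanh(),
        )

    def forward(self, img: torch.Tensor, attrs: torch.Tensor) -> torch.Tensor:
        b, _, h, w = img.shape
        attr_map = attrs.view(b, -1, 1, 1).expand(b, attrs.size(1), h, w)
        return self.net(torch.cat([img, attr_map], dim=1))

# Usage: translate a batch toward a hypothetical "smiling" attribute code.
# fake = AttributeGenerator(n_attrs=5)(imgs, target_attrs)
```

Because the domain is encoded in the input rather than in the weights, the same parameters serve every attribute combination, which is what removes the one-model-per-domain-pair limitation.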
Using Photorealistic Face Synthesis and Domain Adaptation to Improve Facial Expression Analysis
Synthesizing realistic faces across domains to train deep models has attracted
increasing attention in facial expression analysis, as it helps improve
expression recognition accuracy despite the small number of real training
images. However, learning from synthetic face images can be problematic due to
the distribution discrepancy between low-quality synthetic images and real
face images, and it may not achieve the desired performance when the learned
model is applied to real-world scenarios. To this end, we propose a new
attribute-guided face image synthesis method that performs translation between
multiple image domains using a single model. In addition, we adopt the
proposed model to learn from synthetic faces by matching the feature
distributions between different domains while preserving each domain's
characteristics. We evaluate the effectiveness of the proposed approach on
several face datasets in terms of generating realistic face images. We
demonstrate that expression recognition performance can be enhanced by our
face synthesis model. Moreover, we conduct experiments on a near-infrared
dataset containing facial expression videos of drivers to assess performance
on in-the-wild data for driver emotion recognition.
Comment: 8 pages, 8 figures, 5 tables, accepted by FG 2019. arXiv admin note:
substantial text overlap with arXiv:1905.0028
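One common way to match feature distributions between synthetic and real domains is a maximum mean discrepancy (MMD) penalty; the RBF-kernel sketch below is one standard instantiation, not necessarily the paper's exact matching criterion.

```python
import torch

def rbf_mmd(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Maximum mean discrepancy between two feature batches of shape (B, D),
    using a Gaussian (RBF) kernel; smaller values mean closer distributions."""
    def k(a, b):
        d2 = torch.cdist(a, b).pow(2)          # pairwise squared distances
        return torch.exp(-d2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

# Joint objective (illustrative): recognition loss on synthetic faces plus a
# distribution-matching term between synthetic and real feature batches.
# total_loss = task_loss + lam * rbf_mmd(feat_synthetic, feat_real)
```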
Informative sample generation using class-aware generative adversarial networks for classification of chest X-rays
Training robust deep learning (DL) systems for disease detection from medical
images is challenging due to the limited number of images covering different
disease types and severities. The problem is especially acute where there is
severe class imbalance. We propose an active learning (AL) framework that uses
a Bayesian neural network to select the most informative samples for training
our model. The informative samples are then used within a novel class-aware
generative adversarial network (CAGAN) to generate realistic chest X-ray
images for data augmentation by transferring characteristics from one class
label to another. Experiments show that our proposed AL framework achieves
state-of-the-art performance using only a fraction of the full dataset, saving
significant time and effort over conventional methods.
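A minimal sketch of the Bayesian sample-selection step, assuming Monte-Carlo dropout as the approximate Bayesian inference and predictive entropy as the acquisition score; the paper may use a different uncertainty measure, so treat both choices as assumptions.

```python
import torch
import torch.nn.functional as F

def mc_dropout_entropy(model, x: torch.Tensor, passes: int = 20) -> torch.Tensor:
    """Predictive entropy per sample, keeping dropout stochastic at test time
    so repeated forward passes approximate the posterior predictive."""
    model.train()                        # keep dropout layers active
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(x), dim=1) for _ in range(passes)])
    mean = probs.mean(0)                 # (B, C) averaged predictions
    return -(mean * mean.clamp_min(1e-8).log()).sum(1)

# Pick the top-k most uncertain chest X-rays for labeling / GAN augmentation.
# idx = mc_dropout_entropy(net, pool_batch).topk(k).indices
```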
Adaptive Similarity Bootstrapping for Self-Distillation
Most self-supervised methods for representation learning leverage a cross-view
consistency objective, i.e., they maximize the representation similarity of a
given image's augmented views. Recent work, NNCLR, goes beyond the cross-view
paradigm and uses positive pairs from different images, obtained via
nearest-neighbor bootstrapping, in a contrastive setting. We show empirically
that, as opposed to the contrastive setting, which relies on negative samples,
incorporating nearest-neighbor bootstrapping in a self-distillation scheme can
lead to a performance drop or even collapse. We scrutinize the reason for this
unexpected behavior and provide a solution: we propose to adaptively bootstrap
neighbors based on the estimated quality of the latent space. We report
consistent improvements over the naive bootstrapping approach and the original
baselines. Our approach yields performance improvements across various
self-distillation method/backbone combinations and standard downstream tasks.
Our code will be released upon acceptance.
Comment: * denotes equal contribution
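To make the mechanism concrete, here is a hedged sketch of nearest-neighbor bootstrapping with an adaptive gate; the abstract does not spell out how latent-space quality is estimated, so the similarity-threshold gate below is purely an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def nn_bootstrap_target(z_teacher: torch.Tensor, queue: torch.Tensor,
                        threshold: float = 0.6) -> torch.Tensor:
    """Swap each teacher embedding for its nearest neighbor in a support
    queue, but only when the match is similar enough; otherwise fall back to
    the plain cross-view target (the adaptive part, an assumed criterion)."""
    z = F.normalize(z_teacher, dim=1)
    q = F.normalize(queue, dim=1)
    sim, idx = (z @ q.t()).max(dim=1)          # best queue match per sample
    nn_z = queue[idx]
    use_nn = (sim > threshold).unsqueeze(1)    # gate on estimated quality
    return torch.where(use_nn, nn_z, z_teacher)

# Self-distillation loss against the bootstrapped target, e.g.:
# loss = -F.cosine_similarity(student(view_a),
#                             nn_bootstrap_target(teacher(view_b), queue)).mean()
```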
Exploring Factors for Improving Low Resolution Face Recognition
State-of-the-art deep face recognition approaches report near-perfect
performance on popular benchmarks, e.g., Labeled Faces in the Wild. However,
their performance deteriorates significantly when they are applied to
low-quality images, such as those acquired by surveillance cameras. A further
challenge of low-resolution face recognition for surveillance applications is
matching recorded low-resolution probe face images against high-resolution
reference images, as in watchlist scenarios. In this paper, we address these
problems and investigate the factors that contribute to the identification
performance of state-of-the-art deep face recognition models when they are
applied to low-resolution face recognition under mismatched conditions. We
observe that the following factors affect performance positively: appearance
variety and resolution distribution of the training dataset, resolution
matching between the gallery and probe images, and the amount of information
included in the probe images. Leveraging these observations, we use deep face
models trained on MS-Celeb-1M and fine-tuned on the VGGFace2 dataset, and
achieve state-of-the-art accuracies on the SCFace and ICB-RW benchmarks, even
without using any training data from those benchmarks.
Comment: CVPR Workshop on Biometrics 201
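The resolution-matching factor is easy to illustrate: degrade the high-resolution gallery images to the probe's resolution before embedding, so both sides of the match come from the same distribution. The sketch below is a minimal version of that step; the interpolation mode and the 112-pixel network input size are assumptions, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def match_resolution(gallery: torch.Tensor, probe_size: int,
                     model_size: int = 112) -> torch.Tensor:
    """Downsample HR gallery images (B, 3, H, W) to the probe resolution,
    then upsample back to the network input size, mimicking
    surveillance-quality degradation on the reference side."""
    low = F.interpolate(gallery, size=probe_size, mode='bilinear',
                        align_corners=False)
    return F.interpolate(low, size=model_size, mode='bilinear',
                         align_corners=False)

# Embed both sides at matched effective resolution before comparing:
# emb_gallery = face_model(match_resolution(gallery_imgs, probe_size=24))
# emb_probe   = face_model(F.interpolate(probe_imgs, size=112,
#                                        mode='bilinear', align_corners=False))
```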